The data come from a study of the settlement pattern of common terns on a small islet in the Delta d'Ebre (Hernandez and Ruiz), at the mouth of the Ebre river. The islet was inspected at two-day intervals throughout the breeding season. The data include the location of each nest, its elevation above sea level, and the elevations at a number of additional points (points without a nest) on the islet. The file elevationsIslet.txt contains the coordinates and the elevation above sea level, and the file poly84.txt contains the coordinates of the border of the islet. The aim is to predict the elevation above sea level across the islet using kriging interpolation.
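A minimal sketch of loading the data with the geoR package; the column layout of the two files (coordinates in the first two columns, elevation in the third) is an assumption:

```r
library(geoR)

# Elevations (coordinates + elevation) and islet border
# (column layout assumed; adjust coords.col/data.col if it differs)
dataset <- read.table("elevationsIslet.txt", header = TRUE)
border  <- read.table("poly84.txt", header = TRUE)

# geodata object: columns 1-2 are the coordinates, column 3 the elevation
elev.geo <- as.geodata(dataset, coords.col = 1:2, data.col = 3)
elev.geo$borders <- as.matrix(border)

# Sampling locations inside the islet border
points(elev.geo)
```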
First, we plot the locations to see the spatial distribution of the data.
We also look at the distribution in relation to the x-coordinates (E-W) and the y-coordinates (N-S).
The plots show a concentration of high values at the eastern and western extremes of the islet, and we can observe a greater density of points in the north.
The process does not appear stationary: there is a clear quadratic trend along the x direction. In addition, we include the y direction as a regressor to check for a possible linear or quadratic trend there as well.
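The trend surface whose summary appears below can be fitted with `lm`; the name `dataset` and the columns `x`, `y` and `data` match the call shown in the output:

```r
# Quadratic trend surface in both coordinates
trend.fit <- lm(data ~ x + y + I(x^2) + I(y^2), data = dataset)
summary(trend.fit)
```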
##
## Call:
## lm(formula = data ~ x + y + I(x^2) + I(y^2), data = dataset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -22.639 -6.417 -0.964 6.024 22.744
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 17.4502678 2.8429177 6.138 7.78e-09 ***
## x -0.4938096 0.1103379 -4.475 1.55e-05 ***
## y 0.0349038 0.0570014 0.612 0.541
## I(x^2) 0.0040706 0.0009267 4.393 2.17e-05 ***
## I(y^2) 0.0019508 0.0008841 2.206 0.029 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.19 on 143 degrees of freedom
## Multiple R-squared: 0.2026, Adjusted R-squared: 0.1803
## F-statistic: 9.082 on 4 and 143 DF, p-value: 1.446e-06
The linear term in y is not significant (p = 0.541), so we remove it.
We now obtain a model in which all regressors appear significant. We save the residuals of our linear model and look at the de-trended data:
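A sketch of the refit and of the construction of a geodata object for the residuals, assuming `dataset` holds the columns `x`, `y` and `data` as in the lm call above (names are illustrative):

```r
# Refit without the non-significant linear term in y
trend.fit2 <- lm(data ~ x + I(x^2) + I(y^2), data = dataset)
summary(trend.fit2)

# De-trended data: residuals of the reduced model, attached to the
# original coordinates as a new geodata object
res <- residuals(trend.fit2)
res.geo <- as.geodata(cbind(dataset$x, dataset$y, res))

# Plot the residuals against the coordinates to check for remaining trend
plot(res.geo)
```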
What we now obtain is a more homogeneous distribution of the data around the value 0.
We notice no remaining trend, so the de-trended data appear stationary.
We analyse the spatial dependence through a variogram cloud, using the robust (modulus) estimator, since the classical estimator may treat as outliers values that are not necessarily outliers.
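A sketch with geoR, assuming `res.geo` is the geodata object of the de-trended residuals (a hypothetical name); the value of `max.dist` is an illustrative choice:

```r
# Variogram cloud with the robust (modulus) estimator of Cressie-Hawkins
vcloud <- variog(res.geo, option = "cloud", estimator.type = "modulus")
plot(vcloud)

# Binned robust variogram, restricted to a maximum distance and
# requiring at least 30 pairs per bin
vbin <- variog(res.geo, max.dist = 30, pairs.min = 30,
               estimator.type = "modulus")
plot(vbin)
vbin$n  # number of pairs in each bin
```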
## variog: computing omnidirectional variogram
The variability at small distances appears somewhat greater than at larger distances. => We reduce the density of the plot by reducing the maximum distance over which the semivariances are calculated.
## variog: computing omnidirectional variogram
## variog: computing omnidirectional variogram
## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
=> All the bins have at least pairs.min = 30 pairs each; indeed they have:
## [1] 422 650 731 755 660 514 460 473 497 447 457 415 377
Empirical Variogram
## variog: computing omnidirectional variogram
In the variogram of the residuals the values increase up to a certain distance and then stay constant. That is fine, since it is the behaviour we expect from a stationary process.
To check the hypothesis of spatial independence we use a Monte Carlo approach.
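A sketch of the permutation test with `variog.mc.env`, assuming `res.geo` and the binned variogram `vbin` from the previous steps (hypothetical names):

```r
# Generate 3000 permutations of the data values over the fixed locations
# and compute the variogram envelope under spatial independence
env <- variog.mc.env(res.geo, obj.variog = vbin, nsim = 3000)

# Plot the empirical variogram together with the envelope
plot(vbin, envelope.obj = env)
```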
## variog.env: generating 3000 simulations by permutating data values
## variog.env: computing the empirical variogram for the 3000 simulations
## variog.env: computing the envelops
All the values of the empirical variogram lie inside the envelope; therefore we find no evidence of spatial dependence.
To check for anisotropy we compute the directional variograms in the 4 main directions: 0°, 45°, 90° and 135°.
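With geoR this is a single call to `variog4` (again assuming `res.geo` holds the residuals; the default tolerance angle is 22.5°, matching the output below, and `max.dist` is illustrative):

```r
# Directional variograms at 0, 45, 90 and 135 degrees, plus the
# omnidirectional variogram for comparison
v4 <- variog4(res.geo, max.dist = 30)
plot(v4, omnidirectional = TRUE)
```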
## variog: computing variogram for direction = 0 degrees (0 radians)
## tolerance angle = 22.5 degrees (0.393 radians)
## variog: computing variogram for direction = 45 degrees (0.785 radians)
## tolerance angle = 22.5 degrees (0.393 radians)
## variog: computing variogram for direction = 90 degrees (1.571 radians)
## tolerance angle = 22.5 degrees (0.393 radians)
## variog: computing variogram for direction = 135 degrees (2.356 radians)
## tolerance angle = 22.5 degrees (0.393 radians)
## variog: computing omnidirectional variogram
The directional variograms do not seem to overlap perfectly: they might have different sills; in particular the variogram seems to reach a higher value along the 90° direction => geometric anisotropy.
Let's also analyse the range by observing the rose diagram:
## variog: computing variogram for direction = 45 degrees (0.785 radians)
## tolerance angle = 45 degrees (0.785 radians)
## variog: computing variogram for direction = 90 degrees (1.571 radians)
## tolerance angle = 45 degrees (0.785 radians)
## variog: computing variogram for direction = 135 degrees (2.356 radians)
## tolerance angle = 45 degrees (0.785 radians)
## variog: computing variogram for direction = 180 degrees (3.142 radians)
## tolerance angle = 45 degrees (0.785 radians)
We can observe that the rose diagram is more or less a circle: in every direction the sill variance is reached at roughly the same range, so the process is isotropic.
OR:
We notice that the rose diagram is not perfectly circular: it is slightly elliptical, with the major range along the 45° direction => zonal anisotropy.
=> In conclusion we have a combined anisotropy, so we cannot assume an isotropic process, which is necessary for the correct use of kriging techniques, since the theoretical variograms used for kriging are based on isotropic models.
## variog: computing omnidirectional variogram
We'll use a maximum likelihood approach to fit the candidate variogram models.
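The four fits reported below are consistent with exponential, Gaussian, spherical and Matérn covariance models; a sketch of the `likfit` calls, with illustrative starting values for (sigmasq, phi) and `res.geo` (the residual geodata object) as an assumed name:

```r
fit.exp <- likfit(res.geo, ini.cov.pars = c(70, 6),  cov.model = "exponential")
fit.gau <- likfit(res.geo, ini.cov.pars = c(70, 6),  cov.model = "gaussian")
fit.sph <- likfit(res.geo, ini.cov.pars = c(70, 13), cov.model = "spherical")
fit.mat <- likfit(res.geo, ini.cov.pars = c(70, 1),
                  cov.model = "matern", fix.kappa = FALSE)

# Compare the maximised log-likelihoods
sapply(list(exp = fit.exp, gau = fit.gau, sph = fit.sph, mat = fit.mat),
       function(m) m$loglik)
```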
## likfit: estimated model parameters:
## beta tausq sigmasq phi
## "-2.631" "34.598" "71.657" " 6.083"
## Practical Range with cor=0.05 for asymptotic range: 18.2225
##
## likfit: maximised log-likelihood = -531.4
## likfit: estimated model parameters:
## beta tausq sigmasq phi
## "-2.786" "41.157" "66.712" " 5.601"
## Practical Range with cor=0.05 for asymptotic range: 9.693811
##
## likfit: maximised log-likelihood = -529.8
## likfit: estimated model parameters:
## beta tausq sigmasq phi
## "-2.758" "36.363" "69.395" "12.992"
## Practical Range with cor=0.05 for asymptotic range: 12.99251
##
## likfit: maximised log-likelihood = -530.3
## likfit: estimated model parameters:
## beta tausq sigmasq phi kappa
## "-2.7801" "40.9846" "66.7781" " 0.5636" "25.1228"
## Practical Range with cor=0.05 for asymptotic range: 9.875097
##
## likfit: maximised log-likelihood = -529.8
The simulated envelopes of all the fitted models contain the empirical variogram, so we have no evidence to exclude any model.
Let's compare the different models. The two fitted models with the highest log-likelihood are the Matérn model and the Gaussian model, so we'll use these two to perform the kriging prediction step.
First we create a grid where we'll perform our kriging predictions.
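A sketch of the grid construction, assuming `dataset` holds the coordinates in columns `x` and `y`; the grid step is an illustrative choice:

```r
# Regular grid covering the bounding box of the data
pred.grid <- expand.grid(
  x = seq(min(dataset$x), max(dataset$x), by = 0.5),
  y = seq(min(dataset$y), max(dataset$y), by = 0.5))
```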
Let's try using the Gaussian model to do our prediction step:
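A sketch of the kriging and cross-validation steps, assuming `fit.gau` is the fitted Gaussian model, `pred.grid` the prediction grid, `border` the islet border and `res.geo` the residual geodata object (all hypothetical names):

```r
# Ordinary kriging on the grid, clipped to the islet border
kc.gau <- krige.conv(res.geo, locations = pred.grid, borders = border,
                     krige = krige.control(obj.model = fit.gau))

# Maps of the predictions and of the kriging variance
image(kc.gau, col = terrain.colors(64))
image(kc.gau, values = kc.gau$krige.var, col = terrain.colors(64))

# Leave-one-out cross-validation with the same model
xv.gau <- xvalid(res.geo, model = fit.gau)
names(xv.gau)  # data, predicted, krige.var, error, std.error, prob
```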
## krige.conv: model with constant mean
## krige.conv: Kriging performed using global neighbourhood
## xvalid: number of data locations = 148
## xvalid: number of validation locations = 148
## xvalid: performing cross-validation at location ... 1, 2, ..., 148,
## xvalid: end of cross-validation
## [1] "data" "predicted" "krige.var" "error" "std.error" "prob"
## krige.conv: model with constant mean
## krige.conv: Kriging performed using global neighbourhood
## xvalid: number of data locations = 148
## xvalid: number of validation locations = 148
## xvalid: performing cross-validation at location ... 1, 2, ..., 148,
## xvalid: end of cross-validation